Reactive Reinforcement Learning in Asynchronous Environments
Authors
Abstract
The relationship between a reinforcement learning (RL) agent and an asynchronous environment is often ignored. Frequently used models of the interaction between an agent and its environment, such as Markov Decision Processes (MDPs) or Semi-Markov Decision Processes (SMDPs), do not capture the fact that, in an asynchronous environment, the state of the environment may change during computation per...
Similar resources
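The core point of the abstract, that the environment's state can change while the agent is still computing, can be illustrated with a small sketch. This is an illustrative toy, not code from the paper; the class and method names (`AsyncEnvironment`, `observe`) are hypothetical.

```python
import threading
import time

class AsyncEnvironment:
    """Toy environment whose state evolves in a background thread,
    so it may change while the agent is still computing an action.
    Illustrative sketch only; not taken from the paper."""

    def __init__(self):
        self.state = 0
        self._lock = threading.Lock()
        self._running = True
        self._thread = threading.Thread(target=self._evolve, daemon=True)
        self._thread.start()

    def _evolve(self):
        # The environment does not wait for the agent: the state
        # advances on its own clock.
        while self._running:
            with self._lock:
                self.state += 1
            time.sleep(0.01)

    def observe(self):
        with self._lock:
            return self.state

    def stop(self):
        self._running = False
        self._thread.join()

env = AsyncEnvironment()
s_before = env.observe()
time.sleep(0.05)           # simulate a slow action computation
s_after = env.observe()
env.stop()
# Unlike in an MDP, the observed state advanced during "computation".
print(s_after > s_before)
```

An MDP-style model would treat `s_before` as the state the chosen action is applied to; here the world has already moved on, which is exactly the mismatch the paper targets.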
Asynchronous Methods for Deep Reinforcement Learning
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training, allowing all four methods to successfully train neura...
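The asynchronous-update idea in this abstract can be sketched in miniature: several workers apply locally computed gradients to shared parameters without waiting on one another. This is a much-simplified, Hogwild-style illustration under stated assumptions, not the paper's algorithm; `shared_weights`, `worker`, and the constant "gradient" are all hypothetical stand-ins.

```python
import threading

# Shared parameters updated asynchronously by all workers.
shared_weights = [0.0]

def worker(n_updates, lr=0.1):
    for _ in range(n_updates):
        # Stand-in for a gradient computed from the worker's own
        # experience; real actor-learners would interact with copies
        # of the environment here.
        grad = 1.0
        # Applied without locking: workers never wait for each other,
        # so occasional updates may be lost to races.
        shared_weights[0] += lr * grad

threads = [threading.Thread(target=worker, args=(100,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Roughly 4 workers * 100 updates * 0.1, minus any updates lost to races.
print(shared_weights[0])
```

The design point is that lock-free asynchronous updates trade exactness of each step for throughput and, as the abstract notes, the decorrelated experience of parallel learners can itself stabilize training.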
Adaptive Content Presentation in Asynchronous Learning Environments
In Adaptive Educational Hypermedia Systems (AEHS), we expect that the learning content presentation should be appropriately retrieved from learning object repositories, and dynamically tailored to each learner’s needs. Each learner has a profile, subject to continuous change. The basic components of the learner’s profile include his/her cognitive characteristics, background of knowledge, previo...
Generalization over Environments in Reinforcement Learning
We give a method to optimize single-agent behavior for several environments and reinforcement functions by learning in several environments simultaneously in [5]. Now we address the problem of learning in one environment and applying the policy obtained to other environments. We discuss the influence of the environment on the ability to generalize over other environments. How do good learning environments ...
Inverse Reinforcement Learning in Partially Observable Environments
Inverse reinforcement learning (IRL) is the problem of recovering the underlying reward function from the behaviour of an expert. Most of the existing algorithms for IRL assume that the expert’s environment is modeled as a Markov decision process (MDP), although they should be able to handle partially observable settings in order to widen the applicability to more realistic scenarios. In this p...
Journal
Journal title: Frontiers in Robotics and AI
Year: 2018
ISSN: 2296-9144
DOI: 10.3389/frobt.2018.00079